On the sphericity test with large-dimensional observations
In this paper, we propose corrections to the likelihood ratio test and John's
test for sphericity in large dimensions. New formulas for the limiting
parameters in the CLT for linear spectral statistics of sample covariance
matrices with general fourth moments are first established. Using these
formulas, we derive the asymptotic distribution of the two proposed test
statistics under the null. These asymptotics are valid for general populations,
i.e. not necessarily Gaussian, provided the fourth moment is finite. Extensive
Monte-Carlo experiments are conducted to assess the quality of these tests with
a comparison to several existing methods from the literature. Moreover, we also
obtain their asymptotic power functions under the alternative of a spiked
population model. Comment: 37 pages, 3 figures
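As a rough illustration of the uncorrected statistic this abstract starts from, the sketch below computes John's statistic U = (1/p) tr[(S / (tr(S)/p) - I)^2] in plain Python. It assumes zero-mean observations and the normalization S = (1/n) Σ x_i x_i^T. The helper names (`sample_covariance`, `john_statistic`) and the simulated data are hypothetical. The paper's corrections and limiting distributions are not reproduced here.

```python
import random


def sample_covariance(rows):
    """Return S = (1/n) * sum_i x_i x_i^T (observations assumed to have mean zero)."""
    n = len(rows)
    p = len(rows[0])
    s = [[0.0] * p for _ in range(p)]
    for x in rows:
        for i in range(p):
            xi = x[i]
            for j in range(p):
                s[i][j] += xi * x[j] / n
    return s


def john_statistic(rows):
    """John's U = (1/p) * tr[(S / (tr(S)/p) - I)^2] for a list of observations."""
    s = sample_covariance(rows)
    p = len(s)
    mean_eig = sum(s[i][i] for i in range(p)) / p  # tr(S)/p, the average eigenvalue
    # S is symmetric, so tr(A^2) equals the sum of the squared entries of A.
    total = 0.0
    for i in range(p):
        for j in range(p):
            a = s[i][j] / mean_eig - (1.0 if i == j else 0.0)
            total += a * a
    return total / p


# Simulated example (illustrative assumption): n = 200 observations in dimension p = 50.
rng = random.Random(0)
n, p = 200, 50
data = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
u_null = john_statistic(data)  # spherical population

# A crude "spiked" alternative: inflate the variance of the first coordinate.
spiked = [[3.0 * x[0]] + x[1:] for x in data]
u_spiked = john_statistic(spiked)

print(u_null, u_spiked)
```

The statistic is invariant to rescaling the data, which is why it targets sphericity (covariance proportional to the identity) rather than identity itself. Calibrating a rejection threshold when p is comparable to n is exactly what the corrected asymptotics in the paper address.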
On singular values distribution of a large auto-covariance matrix in the ultra-dimensional regime
Consider a sequence of independent real random vectors of a given dimension
and their auto-covariance matrix at a fixed positive integer lag. This
paper investigates the limiting behavior of the singular values of this matrix
under the so-called ultra-dimensional regime, where the dimension and the
number of observations both tend to infinity in a related way. First, we show
that the singular value distribution of the matrix, after a suitable
normalization, converges to a nonrandom limit (quarter law) under a
fourth-moment condition. Second, we establish the convergence of its largest
singular value to the right edge of this limit. Both results are derived using
the moment method. Comment: 32 pages, 2 figures
On Two Simple and Effective Procedures for High Dimensional Classification of General Populations
In this paper, we generalize two criteria, the determinant-based and
trace-based criteria proposed by Saranadasa (1993), to general populations for
high dimensional classification. These two criteria compare some distances
between a new observation and several different known groups. The
determinant-based criterion performs well for correlated variables by
integrating the covariance structure and is competitive with many other existing
rules. The criterion, however, requires the measurement dimension to be smaller than
the sample size. The trace-based criterion, in contrast, is an independence rule
and effective in the "large dimension-small sample size" scenario. An appealing
property of these two criteria is that their implementation is straightforward
and there is no need for preliminary variable selection or use of tuning
parameters. Their asymptotic misclassification probabilities are derived using
the theory of large dimensional random matrices. Their competitive performances
are illustrated by intensive Monte Carlo experiments and a real data analysis. Comment: 5 figures; 22 pages. To appear in "Statistical Papers"
Testing the Sphericity of a covariance matrix when the dimension is much larger than the sample size
This paper focuses on the prominent sphericity test when the dimension is
much larger than the sample size. The classical likelihood ratio test (LRT) is no
longer applicable when the dimension exceeds the sample size. Therefore a
Quasi-LRT is proposed, and the asymptotic distribution of its test statistic
under the null is established in this regime.
Meanwhile, John's test is found to possess a powerful dimension-proof
property: it keeps exactly the same limiting distribution under the null in
any asymptotic regime for the dimension and the sample size. All asymptotic
results are derived for general populations with finite fourth-order moment.
Numerical experiments are carried out for comparison.
Gaussian fluctuations for linear spectral statistics of large random covariance matrices
Consider a matrix built from a nonnegative definite Hermitian matrix and a
random matrix with i.i.d. real or complex standardized entries. The
fluctuations of the linear statistics of the eigenvalues are shown to be
Gaussian, in the regime where both dimensions of the matrix go to infinity at
the same pace and in the case where the test function is of class $C^3$, that
is, has three continuous derivatives. The main improvements with respect to
Bai and Silverstein's CLT [Ann. Probab. 32 (2004) 553-605] are twofold: First,
we consider general entries with finite fourth moment, but whose fourth
cumulant is nonnull, that is, whose fourth moment may differ from the fourth
moment of a (real or complex) Gaussian random variable. As a consequence,
extra terms appear in the limiting variance and in the limiting bias, which
depend not only on the spectrum of the Hermitian matrix but also on its
eigenvectors. Second, we relax the analyticity assumption on the test function
by representing the linear statistics
with the help of Helffer-Sjöstrand's formula. The CLT is expressed in terms
of vanishing Lévy-Prohorov distance between the linear statistics'
distribution and a Gaussian probability distribution, the mean and the variance
of which depend upon the matrix dimensions and may not converge. Comment: Published at http://dx.doi.org/10.1214/15-AAP1135 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Modeling extreme values of processes observed at irregular time steps: Application to significant wave height
This work is motivated by the analysis of the extremal behavior of buoy and
satellite data describing wave conditions in the North Atlantic Ocean. The
available data sets consist of time series of significant wave height (Hs) with
irregular time sampling. In such a situation, the usual statistical methods for
analyzing extreme values cannot be used directly. The method proposed in this
paper is an extension of the peaks over threshold (POT) method, where the
distribution of a process above a high threshold is approximated by a
max-stable process whose parameters are estimated by maximizing a composite
likelihood function. The efficiency of the proposed method is assessed on an
extensive set of simulated data. It is shown, in particular, that the method is
able to describe the extremal behavior of several common time series models
with regular or irregular time sampling. The method is then used to analyze Hs
data in the North Atlantic Ocean. The results indicate that it is possible to
derive realistic estimates of the extremal properties of Hs from satellite
data, despite its complex space-time sampling. Comment: Published at http://dx.doi.org/10.1214/13-AOAS711 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)